
11 research outputs found

A Study of Energy and Locality Effects using Space-filling Curves

Author: Jahre Magnus
Meyer Jan Christian
Reissmann Nico
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 20/06/2016

The cost of energy is becoming an increasingly important driver for the operating cost of HPC systems, adding yet another facet to the challenge of producing efficient code. In this paper, we investigate the energy implications of trading computation for locality using Hilbert and Morton space-filling curves with dense matrix-matrix multiplication. The advantage of these curves is that they exhibit an inherent tiling effect without requiring specific architecture tuning. By accessing the matrices in the order determined by the space-filling curves, we can trade computation for locality. The index computation overhead of the Morton curve is found to be balanced against its locality and energy efficiency, while the overhead of the Hilbert curve outweighs its improvements on our test system. Comment: Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops (IPDPSW)
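
As an illustration of the index computation the abstract refers to, the following is a minimal sketch of Morton (Z-order) indexing, assuming the common bit-interleaving formulation. The function names `morton_encode` and `morton_order` are hypothetical and are not taken from the paper, and the paper's own implementation may differ.

```python
def morton_encode(row: int, col: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of row and col into one Morton index."""
    index = 0
    for i in range(bits):
        index |= ((col >> i) & 1) << (2 * i)      # column bits land on even positions
        index |= ((row >> i) & 1) << (2 * i + 1)  # row bits land on odd positions
    return index


def morton_order(n: int) -> list[tuple[int, int]]:
    """Return every (row, col) cell of an n x n matrix, sorted by Morton index."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    return sorted(cells, key=lambda rc: morton_encode(rc[0], rc[1]))


if __name__ == "__main__":
    # The first four cells form one 2 x 2 tile, the next four the tile beside it.
    print(morton_order(4)[:8])
```

Consecutive Morton indices stay inside small square tiles, which is the tiling effect mentioned above. The per-element bit interleaving is the extra computation that is traded for that locality.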

arXiv.org e-Print Archive

CiteSeerX

RVSDG: An Intermediate Representation for Optimizing Compilers

Author: Bahmann Helge
Meyer Jan Christian
Reissmann Nico
Själander Magnus
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2020

Intermediate Representations (IRs) are central to optimizing compilers as the way the program is represented may enhance or limit analyses and transformations. Suitable IRs focus on exposing the most relevant information and establish invariants that different compiler passes can rely on. While control-flow centric IRs appear to be a natural fit for imperative programming languages, analyses required by compilers have increasingly shifted to understand data dependencies and work at multiple abstraction layers at the same time. This is partially evidenced in recent developments such as the MLIR proposed by Google. However, rigorous use of data flow centric IRs in general purpose compilers has not been evaluated for feasibility and usability as previous works provide no practical implementations. We present the Regionalized Value State Dependence Graph (RVSDG) IR for optimizing compilers. The RVSDG is a data flow centric IR where nodes represent computations, edges represent computational dependencies, and regions capture the hierarchical structure of programs. It represents programs in demand-dependence form, implicitly supports structured control flow, and models entire programs within a single IR. We provide a complete specification of the RVSDG, construction and destruction methods, as well as exemplify its utility by presenting Dead Node and Common Node Elimination optimizations. We implemented a prototype compiler and evaluate it in terms of performance, code size, compilation time, and representational overhead. Our results indicate that the RVSDG can serve as a competitive IR in optimizing compilers while reducing complexity.
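
To give a rough idea of what removing dead nodes means in a data flow centric IR, here is a simplified sketch on a flat dependence graph. It assumes a plain dictionary that maps each node to the nodes it depends on and ignores the RVSDG's regions entirely. The names `mark_live` and `sweep` are hypothetical, and this is not the algorithm specified in the paper.

```python
def mark_live(operands: dict[str, list[str]], results: list[str]) -> set[str]:
    """Collect every node reachable from the observable results via dependencies."""
    live: set[str] = set()
    worklist = list(results)
    while worklist:
        node = worklist.pop()
        if node in live:
            continue  # already visited
        live.add(node)
        worklist.extend(operands.get(node, []))  # its operands are live too
    return live


def sweep(operands: dict[str, list[str]], live: set[str]) -> dict[str, list[str]]:
    """Drop every node that was not marked live."""
    return {node: deps for node, deps in operands.items() if node in live}


if __name__ == "__main__":
    # Hypothetical toy graph: "unused" feeds no result and is therefore dead.
    graph = {"a": [], "b": [], "add": ["a", "b"], "unused": ["a"], "ret": ["add"]}
    live = mark_live(graph, ["ret"])
    print(sorted(sweep(graph, live)))
```

Because edges point from a computation to what it depends on, liveness is just reachability from the results, and no separate control flow analysis is needed in this toy setting.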

arXiv.org e-Print Archive

NORA - Norwegian Open Research Archives

Validation of a Dutch version of the Geriatric Oral Health Assessment Index (GOHAI-NL) in care-dependent and care-independent older people

Author: A Murariu
A Zenthofer
AJ Hassel
C Hagglin
CA McHorney
D Locker
D Niesten
Dick Witter
DJ Zuluaga
DL Wolfe
Dominique Niesten
DR Reissmann
E Rodakowska
EF Juniper
Ewald Bronkhorst
FB Andrade de
GD Slade
GH Verrips
H Kalsbeek
J Fleiss
J Swoboda
JC Nunnally
JF Hair
JG Ponteretto
KA Atchison
KH Lee
L Halvorsrud
L Sischo
LA Clark
LL Lim
M Naito
MA Atieh
MC Wong
MI MacEntee
N Osta El
Nico Creugers
Organization WWH
PM Fayers
R Likert
RK Saarela
S Daradkeh
S Ergul
S Tubert-Jeannin
WO Bearden
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Soil macrofauna communities in Brazilian land-use systems

Author: Alessandra Santos
Amarildo Pasini
Amilton Baggio
Ana Conrado
Antonio Garcia
Aníbal de Moraes
Beatriz Corrêa-Ferreira
Benjamin Pey
Carlos Eduardo Seoane
Carlos Peres
Carlos Reissmann
Carolina Brandani
Cintia Niva
Clara Peña-Venegas
Daiane Nunes
Dionisio Gazziero
Edinelson Neves
Elena Velásquez
Eleno Torres
Elma Oliveira
Elodie Silva
George Brown
Gilherme Cardoso
Herlon Nadolny
Jean-Paul Laclau
Jean-Pierre Bouillet
José Gonçalves
Julio dos Santos
Jérôme Mathieu
Klaus Sautter
Lenita Oliveira
Lilianne Bruz
Lina Clasen
Lucília Parron
Luis Froufe
Marcus Cremonesi
Mariangela Hungria
Marie Bartz
Mauricio Zagatto
Miguel Cooper
Nico Eisenhauer
Norton Benito
Odair Alberton
Odilon Saraiva
Osvaldino Brandão Júnior
Patrick Lavelle
Paulo Galerani
Quentin Gabriac
Rafaela Dudas
Ranieri Paula
Raul César
Ricardo Viani
Samuel James
Talita Ferreira
Thiago Campos
Thibaud Decaëns
Vagner da Silva
Vanesca Korasaki
Wagner Maschio
Wilian Demetrio
Publication venue: Pensoft Publishers
Publication date: 01/01/2024

Soil animal communities include more than 40 higher-order taxa, representing over 23% of all described species. These animals have a wide range of feeding sources and contribute to several important soil functions and ecosystem services. Although many studies have assessed macroinvertebrate communities in Brazil, few of them have been published in journals and even fewer have made the data openly available for consultation and further use. As part of ongoing efforts to synthesise the global soil macrofauna communities and to increase the amount of openly-accessible data in GBIF and other repositories related to soil biodiversity, the present paper provides links to 29 soil macroinvertebrate datasets covering 42 soil fauna taxa, collected in various land-use systems in Brazil. A total of 83,085 georeferenced occurrences of these taxa are presented, based on quantitative estimates performed using a standardised sampling method commonly adopted worldwide to collect soil macrofauna populations, i.e. the TSBF (Tropical Soil Biology and Fertility Programme) protocol. This consists of digging soil monoliths of 25 x 25 cm area, with handsorting of the macroinvertebrates visible to the naked eye from the surface litter and from within the soil, typically in the upper 0-20 cm layer (but sometimes shallower, i.e. top 0-10 cm or deeper to 0-40 cm, depending on the site). The land-use systems included anthropogenic sites managed with agricultural systems (e.g. pastures, annual and perennial crops, agroforestry), as well as planted forests and native vegetation located mostly in the southern Brazilian State of Paraná (96 sites), with a few additional sites in the neighbouring states of São Paulo (21 sites) and Santa Catarina (five sites). 
Important metadata on soil properties, particularly soil chemical parameters (mainly pH, C, P, Ca, K, Mg, Al contents, exchangeable acidity, Cation Exchange Capacity, Base Saturation and, infrequently, total N), particle size distribution (mainly % sand, silt and clay) and, infrequently, soil moisture and bulk density, as well as on human management practices (land use and vegetation cover) are provided. These data will be particularly useful for those interested in estimating land-use change impacts on soil biodiversity and its implications for below-ground foodwebs, ecosystem functioning and ecosystem service delivery. Quantitative estimates are provided for 42 soil animal taxa, for two biodiversity hotspots: the Brazilian Atlantic Forest and Cerrado biomes. Data are provided at the individual monolith level, representing sampling events ranging from February 2001 up to September 2016 in 122 sampling sites and over 1800 samples, for a total of 83,085 occurrences.

Directory of Open Access Journals

ARPHA OAI-PMH Endpoint

Principles, Techniques, and Tools for Explicit and Automatic Parallelization

Author: Reissmann Nico
Publication venue: 'Norwegian University of Science and Technology (NTNU) Library'
Publication date: 01/01/2019

The end of Dennard scaling also brought an end to frequency scaling as a means to improve performance. Chip manufacturers had to abandon frequency and superscalar scaling as processors became increasingly power constrained. An architecture’s power budget became the limiting factor to performance gains, and computations had to be performed more energy-efficiently. Designers turned to chip multiprocessors (CMPs) and developers began to employ specialized architectures, such as Graphics Processing Units (GPUs) and Field Programmable Gate Arrays (FPGAs), to further improve performance while meeting the power envelope. The exploitation of parallelism in an energy-efficient manner became the primary way forward. Until the end of Dennard scaling, programs experienced transparent performance gains with every new processor generation. However, CMPs, GPUs, and FPGAs rely on the static extraction of parallelism to improve performance, and programs need to be modified to take advantage of these architectures. Thus, performance gains are no longer achieved transparently, and developers and tools are forced to face new, as well as long-neglected challenges in program parallelization. These challenges include the detection and encoding of potential parallelism in automatic approaches, application portability issues on GPUs, and performance portability issues on CMPs. It is essential to address these challenges, as the continuous increase in computer performance now solely relies on the exploitation of parallelism. This thesis consists of three parts, each addressing one of the aforementioned challenges in program parallelization. The first part addresses the detection and encoding of potential parallelism in automatic approaches. It presents the Regionalized Value State Dependence Graph (RVSDG) as an alternative intermediate representation for optimizing and parallelizing compilers. 
The RVSDG exposes the hierarchical structure of programs and explicitly models the dependencies between computations, permitting the explicit encoding of concurrent operations and program structures, such as conditionals, loops, and functions. This helps to expose the inherent parallelism in programs and its structures by employing well-known methods for the extraction of instruction level parallelism. The second part addresses application portability issues on GPUs. A GPU’s specialized architecture is optimized for highly regular data-parallel applications, but compromises program performance for workloads with irregular control flow, potentially leading to redundant code execution. We propose a control flow restructuring method to effectively eliminate repeated code execution on GPUs and potentially improve performance. The third part addresses performance portability on CMPs. This issue arises as developers overfit their application to a specific architecture, which results in suboptimal performance for different program inputs or different architectures. We improve performance analysis for OpenMP programs by addressing the scalability challenges of the grain graph visualization method. We present an aggregation method for grain graphs that hierarchically groups related nodes into a single node. This aggregated graph can then be navigated by progressively uncovering nodes with performance issues, while hiding unrelated regions of the graph. This enhances productivity by enabling developers to understand performance problems of highly-parallel OpenMP programs more easily. The insights and techniques developed by addressing these three challenges may result in improved methods and tools for the exploitation of parallelism. The RVSDG is a promising IR for parallelizing compilers, as it permits the encoding of concurrent computations. The grain graph offers a familiar structural view to developers along with the performance issues of a particular program. 
In the future, it is necessary to cast these ideas into mature tools to make them applicable in practice and foster further research.

NORA - Norwegian Open Research Archives

Diagnosing Highly-Parallel OpenMP Programs with Aggregated Grain Graphs

Author: Muddukrishna Ananya
Reissmann Nico
Publication venue: Springer Verlag
Publication date: 01/01/2018

Grain graphs simplify OpenMP performance analysis by visualizing performance problems from a fork-join perspective that is familiar to programmers. However, when programmers decide to expose a high amount of parallelism by creating thousands of task and parallel for-loop chunk instances, the resulting grain graph becomes large and tedious to understand. We present an aggregation method that hierarchically groups related nodes together to reduce grain graphs of any size to one single node. This aggregated graph is then navigated by progressively uncovering groups and following visual clues that guide programmers towards problems while hiding non-problematic regions. Our approach enhances productivity by enabling programmers to understand problems in highly-parallel OpenMP programs with less effort than before.

NORA - Norwegian Open Research Archives

RVSDG: An intermediate representation for optimizing compilers

Author: Bahmann Helge
Meyer Jan Christian
Reissmann Nico
Själander Magnus
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2020

NORA - Norwegian Open Research Archives

Towards Fine-Grained Dynamic Tuning of HPC Applications on Modern Multi-Core Architectures

Author: Hackenberg Daniel
Kjeldsberg Per Gunnar
Langguth Johannes
Raknes Espen Birger
Reissmann Nico
Schöne Robert
Sourouri Mohammed
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017

There is a consensus that exascale systems should operate within a power envelope of 20MW. Consequently, energy conservation is still considered the most crucial constraint if such systems are to be realized. So far, most research on this topic has focused on strategies such as power capping and dynamic power management. Although these approaches can reduce power consumption, we believe that they might not be sufficient to reach the exascale energy-efficiency goals. Hence, we aim to adopt techniques from embedded systems, where energy-efficiency has always been the fundamental objective. A successful energy-saving technique used in embedded systems is to integrate fine-grained autotuning with dynamic voltage and frequency scaling. In this paper, we apply a similar technique to a real-world HPC application. Our experimental results on an HPC cluster indicate that such an approach can save up to 19% of energy compared to the baseline configuration, with negligible performance loss.

NORA - Norwegian Open Research Archives

Perfect Reconstructability of Control Flow from Demand Dependence Graphs

Author: Erosa Ana
Havlak Paul
Helge Bahmann
Jan Christian Meyer
Johnson Neil
Magnus Jahre
Nico Reissmann
Paleczny Michael
Stanier James
Zhang Fubo
Publication venue: 'Association for Computing Machinery (ACM)'

Crossref